A Theory of Probabilistic Boosting, Decision Trees and Matryoshki

نویسنده

  • Etienne Grossmann
چکیده

We present a theory of boosting probabilistic classifiers. We place ourselves in the situation of a user who only provides a stopping parameter and a probabilistic weak learner/classifier and compare three types of boosting algorithms: probabilistic Adaboost, decision tree, and tree of trees of ... of trees, which we call matryoshka. “Nested tree,” “embedded tree” and “recursive tree” are also appropriate names for this algorithm, which is one of our contributions. Our other contribution is the theoretical analysis of the algorithms, in which we give training error bounds. This analysis suggests that the matryoshka leverages probabilistic weak classifiers more efficiently than simple decision trees.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Decision Trees and Forests: A Probabilistic Perspective

Decision trees and ensembles of decision trees are very popular in machine learning and often achieve state-of-the-art performance on black-box prediction tasks. However, popular variants such as C4.5, CART, boosted trees and random forests lack a probabilistic interpretation since they usually just specify an algorithm for training a model. We take a probabilistic approach where we cast the de...

متن کامل

Outlier Detection by Boosting Regression Trees

A procedure for detecting outliers in regression problems is proposed. It is based on information provided by boosting regression trees. The key idea is to select the most frequently resampled observation along the boosting iterations and reiterate after removing it. The selection criterion is based on Tchebychev’s inequality applied to the maximum over the boosting iterations of ...

متن کامل

On the Practice of Branching Program Boosting

Branching programs are a generalization of decision trees. From the viewpoint of boosting theory the former appear to be exponentially more eecient. However, earlier experience demonstrates that such results do not necessarily translate to practical success. In this paper we develop a practical version of Mansour and McAllester's 13] algorithm for branching program boosting. We test the algorit...

متن کامل

Improving reservoir rock classification in heterogeneous carbonates using boosting and bagging strategies: A case study of early Triassic carbonates of coastal Fars, south Iran

An accurate reservoir characterization is a crucial task for the development of quantitative geological models and reservoir simulation. In the present research work, a novel view is presented on the reservoir characterization using the advantages of thin section image analysis and intelligent classification algorithms. The proposed methodology comprises three main steps. First, four classes of...

متن کامل

Probabilistic analysis of the asymmetric digital search trees

In this paper, by applying three functional operators the previous results on the (Poisson) variance of the external profile in digital search trees will be improved. We study the profile built over $n$ binary strings generated by a memoryless source with unequal probabilities of symbols and use a combinatorial approach for studying the Poissonized variance, since the probability distribution o...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/cs/0607110  شماره 

صفحات  -

تاریخ انتشار 2006